2023
| Potential outcome | \(D=1\) | \(D=0\) |
|---|---|---|
| \(Y(1)\) | \(Y\) | missing |
| \(Y(0)\) | missing | \(Y\) |
Average treatment effect on the treated (ATT)
\[\tau_{\mathrm{att}} = \mathbb{E} [Y(1)-Y(0) | D = 1]\]
If we assume \(Y(0) \perp D\), we can use the observable \(\mathbb{E} [Y(0) | D = 0]\) in place of the unobservable \(\mathbb{E} [Y(0) | D = 1]\), so that \(\tau_{\mathrm{att}} = \mathbb{E} [Y | D = 1] - \mathbb{E} [Y | D = 0]\). In a panel-data setting, other methods based on regression are available, e.g., propensity score matching.
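The identification argument above can be checked numerically. The following is a minimal sketch, assuming a simulated population in which treatment is assigned independently of \(Y(0)\); the function name `simulate_att`, the sample size, and the constant effect of 2 are illustrative choices, not taken from the source.

```python
import numpy as np

def simulate_att(n=200_000, effect=2.0, seed=0):
    """Compare the true ATT with the difference in observed group means."""
    rng = np.random.default_rng(seed)
    d = rng.integers(0, 2, size=n)       # treatment drawn independently of Y(0)
    y0 = rng.normal(10.0, 1.0, size=n)   # potential outcome without treatment
    y1 = y0 + effect                     # potential outcome with treatment
    y = np.where(d == 1, y1, y0)         # only one potential outcome is observed
    true_att = np.mean(y1[d == 1] - y0[d == 1])        # needs the missing Y(0)
    naive_att = y[d == 1].mean() - y[d == 0].mean()    # uses observed data only
    return true_att, naive_att

true_att, naive_att = simulate_att()
print(f"true ATT: {true_att:.3f}, difference in means: {naive_att:.3f}")
```

Under independence the two numbers should agree up to sampling noise; if `d` were made to depend on `y0`, the difference in means would in general no longer recover the ATT.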
Abadie and Gardeazabal (2003); Abadie, Diamond, and Hainmueller (2010)
The paper I have chosen to summarise is “Inferring Causal Impact Using Bayesian Structural Time-Series Models” by Brodersen et al. (2015).
In the most general sense, Brodersen et al. (2015) suggest using a Bayesian structural time-series model, which can be written in state-space form as
\[ \begin{split} y_t &= Z_t^\intercal \alpha_t + \epsilon_t\\ \alpha_{t+1} &= T_t \alpha_t + R_t \eta_t. \end{split} \]
Here \(y_t\) is the scalar observation, \(\alpha_t\) is the latent state vector, \(Z_t\) is the output vector, \(T_t\) is the transition matrix, \(R_t\) is the control matrix, and \(\epsilon_t\) and \(\eta_t\) are Gaussian observation and system errors, respectively.
For example, with a local linear trend (level \(\mu_t\), slope \(\delta_t\)) and a single regressor \(x_t\) with time-varying coefficient \(\beta_t\), the state is \(\alpha_t = (\mu_t, \delta_t, \beta_t)^\intercal\) and the two equations become
\[ y_t = \left[ \begin{matrix} 1 & 0 & x_t\\ \end{matrix} \right] \left[\begin{matrix} \mu_{t}\\ \delta_{t}\\ \beta_{t} \end{matrix}\right] + \epsilon_t, \]
\[ \left[\begin{matrix} \mu_{t+1}\\ \delta_{t+1}\\ \beta_{t+1} \end{matrix}\right] = \left[\begin{matrix} 1 & 1 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{matrix}\right] \left[\begin{matrix} \mu_{t}\\ \delta_{t}\\ \beta_{t} \end{matrix}\right] + \left[\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 1 \end{matrix} \right] \left[\begin{matrix} \eta_{\mu,t}\\ \eta_{\delta,t}\\ \eta_{\beta,t} \end{matrix}\right]. \]
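To make the matrix form concrete, here is a minimal sketch that simulates a local linear trend with one dynamic regression coefficient. It assumes the state is ordered as \((\mu_t, \delta_t, \beta_t)\); the function name `simulate_ssm`, the noise scales, and the initial state are hypothetical values chosen for illustration only.

```python
import numpy as np

def simulate_ssm(x, sd_obs=0.1, sd_state=(0.1, 0.01, 0.01),
                 alpha0=(20.0, 0.0, 1.0), seed=0):
    """Simulate y_t = Z_t' alpha_t + eps_t, alpha_{t+1} = T alpha_t + R eta_t."""
    rng = np.random.default_rng(seed)
    T = np.array([[1.0, 1.0, 0.0],    # mu_{t+1}    = mu_t + delta_t + eta_mu
                  [0.0, 1.0, 0.0],    # delta_{t+1} = delta_t + eta_delta
                  [0.0, 0.0, 1.0]])   # beta_{t+1}  = beta_t + eta_beta
    R = np.eye(3)                     # each state component gets its own shock
    alpha = np.asarray(alpha0, dtype=float)
    y = np.empty(len(x))
    states = np.empty((len(x), 3))
    for t, x_t in enumerate(x):
        Z_t = np.array([1.0, 0.0, x_t])          # picks out mu_t + x_t * beta_t
        states[t] = alpha
        y[t] = Z_t @ alpha + rng.normal(0.0, sd_obs)
        eta = rng.normal(0.0, sd_state)          # one draw per state component
        alpha = T @ alpha + R @ eta
    return y, states

x = np.sin(2 * np.pi * np.arange(100) / 25)      # illustrative regressor
y, states = simulate_ssm(x)
```

Setting all noise scales to zero makes the recursion deterministic, which is a convenient way to check that the matrices reproduce the component-wise updates written in the comments.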
\[ p(\varrho,\beta,1/\sigma_\epsilon^2)=p(\varrho)p(\sigma_\epsilon^2|\varrho)p(\beta_\varrho|\varrho,\sigma_\epsilon^2) \]
\[\varrho\sim\text{Bernoulli}(.)\]
\[(\beta_\varrho|\sigma_\epsilon^2,\varrho=1)\sim\mathcal{N}(.)\]
\[(1/\sigma_\epsilon^2|\varrho=1) \sim \mathcal{G}(.)\]
Zellner’s g-prior
\[ \begin{split} y_t &= \beta_{t,1}x_{t,1}+\beta_{t,2}x_{t,2}+\mu_t+\epsilon_t\\ \beta_{t,i} &\sim \mathcal{N}(\beta_{t-1,i},0.01^2);\quad \beta_{0,i}=0; \quad i \in \{1,2\}\\ \mu_t &\sim \mathcal{N}(\mu_{t-1},0.1^2);\quad\mu_0=20\\ \epsilon_t &\sim \mathcal{N}(0,0.1^2). \end{split} \]
Brodersen et al. (2015) apply a multiplicative factor to imitate a causal effect, so that the final observations are given by \(y^*_t=y_t \mathbb{I}\{t<366\}+y_t(1+e)(1-\mathbb{I}\{t<366\})\), where \(e\) is the relative effect size and \(\mathbb{I}\{\cdot\}\) is the indicator function, equal to 1 when its argument holds and 0 otherwise. That is, \(y^*_t = y_t\) for \(t<366\) and \(y^*_t = (1+e)\,y_t\) from \(t=366\) onwards.
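The simulation design can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the covariates \(x_{t,1}, x_{t,2}\) are not specified in this section, so sinusoids are assumed here, and the function name `simulate_series`, the series length, and the effect size `e=0.1` are hypothetical.

```python
import numpy as np

def simulate_series(n=500, t0=366, e=0.1, seed=0):
    """Simulate y_t from the random-walk model and inject a multiplicative effect."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    # Assumed covariates (not given in the text): two sinusoids.
    x = np.column_stack([np.sin(2 * np.pi * t / 90),
                         np.sin(2 * np.pi * t / 360)])
    # Random-walk coefficients starting from beta_0 = 0, step sd 0.01.
    beta = np.cumsum(rng.normal(0.0, 0.01, size=(n, 2)), axis=0)
    # Random-walk level starting from mu_0 = 20, step sd 0.1.
    mu = 20.0 + np.cumsum(rng.normal(0.0, 0.1, size=n))
    eps = rng.normal(0.0, 0.1, size=n)
    y = (beta * x).sum(axis=1) + mu + eps
    # Multiplicative effect from t0 onwards: y* = y for t < t0, (1 + e) y otherwise.
    y_star = np.where(t < t0, y, y * (1.0 + e))
    return t, y, y_star

t, y, y_star = simulate_series()
```

Keeping both `y` and `y_star` is deliberate: the unperturbed series plays the role of the counterfactual, so the injected effect `y_star - y` is known exactly and can be compared with whatever a model estimates.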
\[ \beta_{t,i} \sim \begin{cases} \mathcal{N}(\beta_{t-1,i},\ 0.01^2) &\text{if } t < 366+90\\ \mathcal{N}(\beta_{t-1,i},\ c^2) &\text{otherwise}. \end{cases} \]
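The change in the coefficient dynamics can be illustrated with a short sketch. It assumes a single coefficient path starting at \(\beta_0 = 0\); the function name `simulate_beta`, the series length, and the value `c=0.1` are hypothetical choices for illustration.

```python
import numpy as np

def simulate_beta(n=600, t_break=366 + 90, sd_pre=0.01, c=0.1, seed=0):
    """Random-walk coefficient whose step sd switches from sd_pre to c at t_break."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    sd = np.where(t < t_break, sd_pre, c)   # per-step standard deviation
    steps = rng.normal(0.0, sd)             # one increment per time point
    beta = np.cumsum(steps)                 # beta_t = beta_{t-1} + step_t, beta_0 = 0
    return t, beta

t_beta, beta = simulate_beta()
```

With `c` larger than `sd_pre`, the relationship between the covariate and the response drifts faster after the break, so a model fitted to the earlier period becomes progressively less reliable.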